ABSTRACT

An instrument, "The Assessment of Cognitive Transfer
in Science Inventory," was designed to evaluate cognitive performance
in biology. The instrument is based on students' verbal responses to
a structured sequence of situations and questions. Items were
classified in terms of a modification of Bloom's Taxonomy of
Educational Objectives. The instrument was tested on students sampled
to represent a cross-section of biology instruction. Analysis of the
data includes tests of independence of the classification categories.
Estimations are given of item difficulty, discrimination, and test
reliability. (EB)
The Development and Implementation of an Instrument to Assess
Cognitive Performance in High School Biology

As means or as ends the goals of secondary science instruction
suggest increased emphasis on developing cognitive skills, as defined
by Bloom (2), above knowledge level. The proponents of
said goals would agree with Kochendorfer's (12) premise that the
ultimate test of any new "curriculum" is the extent to which it
meets its desired goals.

Research at the operational level that demonstrates the effectiveness
of such programs and identifies the factors associated
with goal attainment, i.e., dependent variables, has taken a host
of forms and reflects considerable heterogeneity in design and scope.
Ramsey and Howe (14) discussed the problems encountered in such research.
They provide a classification scheme that reflects the
status of research in science education. Noticeably absent are
entries that attempt to elucidate the transfer of training and
learning. Scarcely represented in the research literature are
schemes which assess student achievement at the higher cognitive
levels such as analysis, synthesis and evaluation, using evaluation
instruments designed to measure student performance on tasks related
in content, process, and strategy to the new courses.



Statement of the Problem and Purpose of the Study



Ennis (7) and others (13, 18) recognized the need for critical
thinking tests in various subject matter areas. It is notorious in
his words that "some are good critical thinkers in one area and not
in other areas". Therefore, critical thinking to some extent (the
extent not definable) is specific to the field in which it takes
place. How then can one justify using measures like the Watson-
Glaser Critical Thinking Appraisal (19) which do not refer to a
specific science discipline or, for that matter, are not restricted
to science? With such instruments serving as criterion measures we
may well be assessing gains mediated by social studies or English
classes or possibly television commercials.

The operational definition for critical thinking as used in
this study is as follows: If we state explicit behaviors we expect
of students as exemplars of the processes of science and at the
same time state minimum levels of performance by cognitive level,
students will be thinking critically if these behaviors are exhibited.
In other words, critical thinking cannot be separated from
the act of cognition, if the act is one of analysis, synthesis, or
evaluation. Those situations which capitalize on the structure and
content of antecedent tasks or conceptualizations based on past
experience will provide the best setting for defining or delineating
such behaviors. An instrument designed to measure critical thinking
in a specific science area would be contingent on the continuation
of the premise that each operation be based on the structure and
content of antecedent tasks.

This investigator further submits that one cannot measure such
entities as critical thinking or strategies of discovery within a
discipline without designing instruments that measure other aspects
of performance, specifically performance in thinking cognitively.
For this reason the above definition of critical thinking, as
applied herein, is contingent upon direct measures of cognition.
The total learning process should be reflected in such proposed
evaluation instruments.

If the Taxonomy of Educational Objectives: Cognitive Domain (2)
is directly based on learning theory and the psychological processes
involved in learning, it, or schemes developed from it, should prove
valid tools for determining relationships among similar learning processes
and between teaching methodology and learning process. As
stressed by Bloom (3), future research which makes use of the taxonomy
may reveal psychological relations among the different classes of objectives
and the extent to which transfer and retention differ among
the major types of objectives.

A corollary to the above definition might be that groups of
operations be preceded by a visual stimulus, i.e., situations. These
could serve as foci for minimizing communication barriers and as referents
to the structural integrity of the instrument.
Based on the aforementioned statement of the problem, it is the
purpose of this study to: 1) Develop an evaluation instrument for
the discipline of biology relevant to secondary school instruction;
2) Describe the inherent qualities of the instrument and identify
uses in psychological studies; 3) Classify selected operations by
taxonomic category using a scheme analogous to that outlined in the
Taxonomy of Educational Objectives: Cognitive Domain; 4) Administer
the instrument to samples of students representing diversity of
biology instruction and concomitantly assess performance at each
represented taxonomic level; 5) Compare performance by taxonomic
level as defined above, by background in biology, and by achievement
level on a recognized criterion measure of critical thinking
ability.



Methods and Procedures 
Instrumentation 

Based on the above rationale the Assessment of Cognitive Transfer:
An Evaluation Instrument for Secondary Biology Teaching was developed.
It was written using a branching program format centered around nine
structurally related biological situations. Each frame required a
verbal response by the student. The criteria reflected in the major
frames of the program served as the basis for the forty-two item
Assessment of Cognitive Transfer in Science Inventory. Some of the
behaviors measured included: observations relevant to stated
hypotheses or research designs, generation of hypotheses given
underlying assumptions, designing experiments, recalling concepts,
predicting results, explaining phenomena and discovering relationships
based on observations. To satisfy the items in the ACTS
Inventory students had to demonstrate the minimum performance set for
each criterion.

Each item in the ACTS Inventory was classified by cognitive category
using a scheme unique to science but patterned after the Taxonomy
of Educational Objectives, Handbook I: Cognitive Domain (2).
The knowledge level was extracted from the "BSCS Grid for Test Analysis"
by Klinckmann (11). The higher levels designated as the processes of
science represented segments of the scheme developed by Brown (4).
The categories of cognition represented in the inventory were: knowledge,
application, collection of data, analysis of data, withholds
judgment, synthesis and evaluation as arranged in order of their
assumed hierarchy.

The experimenter served as the sole arbiter in classifying each
item in the ACTS Inventory by cognitive category. Table 1 contains
the distribution of items by category of cognition for each of the
42 items in the inventory.

Sampling Procedure 

Selection of students for the sample was based primarily on the
BSCS Attitude Inventory by Blankenship (1) and the Biology Classroom
TABLE 1

DISTRIBUTION OF ACTS ITEMS
BY COGNITIVE CATEGORY

Category*                  Item Number

1.  Knowledge              6, 7, 12, 31
2.  Application            1, 2, 3, 9, 11, 15, 16, 19, 20, 30, 34
3.  Collection of Data     8, 14, 17, 22, 25, 32, 40
4.  Analysis of Data       13, 21, 26, 29, 37, 39
5A. Synthesis of Data      10, 28, 33, 38, 41
5B. Withholds Judgment     4, 18, 23, 27, 35
6.  Evaluation of Data     5, 24, 36, 42

*Descriptions of categories are given in Appendix D.



Activity Checklist by Kochendorfer (12). The population was limited
to 1968-69 biology students of teachers enrolled in a BSCS Summer
Institute at the University of Iowa during 1969. Students from four
teachers were selected at random for interviewing. The students had
BSCS or non-BSCS backgrounds as determined by teacher scores on BSCS
Attitude Inventory and mean composite rating on BCAC. In addition
one sample was selected from a group taking BSCS biology. The classes
randomly sampled are characterized by the data in Tables 2 and 3.



For the purpose of sampling only those teachers from Iowa schools
were considered. Consequently, all schools involved administer the
Iowa Test of Educational Development (10). One subtest in this instrument
served as a criterion measure of critical thinking for comparing
performance on the ACTS Inventory. This subtest, Test 6: Ability to
Interpret Reading Materials in the Natural Sciences, was described in
the ITED manual for teachers and counselors:



". . . Tests 5, 6, and 7 measure the ability to interpret
reading materials in the social studies, the natural sciences,
and literature. While constructed in the external
form of a reading comprehension test, these three tests are
designed to measure much more than generalized reading
skills. Essentially, they are intended to measure the pupil's
ability to do critical thinking in the broad areas
designated. They are concerned not so much with what the
pupil has learned, in the sense of specific information,
but rather with how well he can use whatever he has learned
in acquiring, interpreting and evaluating new ideas, in relating
new ideas to old, and in applying broad concepts and
generalizations to new situations or to the solution of
problems . . ." (9)
TABLE 2

CLASSIFICATION OF SAMPLING GROUPS

                   High                   Low
                   Attitude Inventory     Attitude Inventory
                   SCAC                   SCAC

BSCS               Class A
Non-BSCS           Class B                Classes C, D
One Month BSCS     Class E*

*No SCAC scores available.



TABLE 3

CHARACTERISTICS OF SAMPLE GROUPS BASED ON AVAILABLE DATA

Sample         SCAC                           Highest    Year of    Years
Designation    (Mean)    AI     WG     TOUS   Degree     Highest    Teaching
                                                         Degree     Experience

A              33.9      24     73     47     MA         1967       6
B              30.0      27     66     49     MA         1967       7
C              27.8      21     63     45     BA         1966       3
D              26.9      17     58     37     BS         1965       3
E              (Same teacher as Sample Group A, no student data
               currently available.)

SCAC = Science Classroom Activities Checklist (12)
AI   = Attitude Inventory (1)
WG   = Watson-Glaser Critical Thinking Appraisal (19)
TOUS = Test On Understanding Science (5)
Statement of Null Hypotheses

Null Hypothesis for Independence of Classes With Respect to ITED
Test 6.

1) The five classes do not differ with respect to the frequency
of students in high, middle, and low levels of
performance for ITED Test 6.

Null Hypotheses for Independence With Respect to Total ACTS
Inventory Scores.

2) The five classes do not differ with respect to the frequency
of students in the high, middle, and low levels of
performance for the total ACTS Inventory.

3) Performance of students by level for ITED Test 6 does not
differ with respect to performance of students by level
for the total ACTS Inventory.

Null Hypotheses for Independence for ACTS Inventory Scores by
Cognitive Category.

4) The five classes do not differ with respect to performance
by level for Category 1 (Knowledge) of the ACTS Inventory.

5) The five classes do not differ with respect to performance
by level for Category 2 (Application) of the ACTS Inventory.

6) The five classes do not differ with respect to performance
by level for Category 3 (Collection of Data) of the ACTS
Inventory.

7) The five classes do not differ with respect to performance
by level for Category 4 (Analysis of Data) of the ACTS
Inventory.

8) The five classes do not differ with respect to performance
by level for Category 5A (Synthesis of Data) of the ACTS
Inventory.

9) The five classes do not differ with respect to performance
by level for Category 5B (Withholds Judgment) of the ACTS
Inventory.

10) The five classes do not differ with respect to performance
by level for Category 6 (Evaluation of Data) of the ACTS
Inventory.

11) The performance of students on the ACTS Inventory by levels
does not differ when two contiguous categories are compared
with other pairs of contiguous categories.

12) The performance of students on the ACTS Inventory by levels
does not differ when one cognitive category is compared
with the remaining cognitive categories.



Results and Interpretations 

For empirical reasons reflected in the nature of this study and
since the conditions for parametric tests could not be satisfied,
nonparametric procedures were employed. The Chi-Square test for k
independent samples (15, pp. 175-179) was used to test most of the above
null hypotheses. This test requires that the expected frequencies in each
cell not be too small (15, p. 179). According to Tate (16, p. 71)
how small is a difficult question to answer. There is general agreement
that when the degrees of freedom is larger than two, fewer than
twenty percent of the cells should contain an expected frequency of
less than five. No cell should contain an expected frequency of less
than one. However, according to Tate (16, p. 71) there is considerable
evidence that this requirement is too high if: 1) there are two or more
degrees of freedom or 2) the expected frequencies over the entire table
average out to more than five per cell.
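The rules of thumb quoted above can be checked mechanically once the marginal totals of a contingency table are known. The following is a minimal sketch, not part of the original analysis; the function names and the illustrative marginals (three performance levels by five classes of 20 students) are assumptions introduced for the example.

```python
def expected_frequencies(row_totals, col_totals):
    """Expected cell counts under independence: row total * column total / N."""
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def small_cell_summary(expected):
    """Summarize an expected-frequency table against the usual rules of thumb."""
    cells = [e for row in expected for e in row]
    return {
        "min_expected": min(cells),                       # should not fall below 1
        "share_below_five": sum(e < 5 for e in cells) / len(cells),  # strict rule: under 0.20
        "mean_expected": sum(cells) / len(cells),         # relaxed rule: above 5
    }


# Illustrative (assumed) marginals: three levels by five classes of 20 students.
summary = small_cell_summary(expected_frequencies([46, 30, 24], [20] * 5))
print(summary)
```

With marginals of this kind a table may miss the strict twenty-percent rule while still meeting the relaxed criterion of an average expected frequency above five, which is the situation the relaxed rule is meant to cover.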
The intention here is not to emphasize the sheer significance of
any test. Rather, in agreement with Hayes (8, p. 614), an attempt
was made to appraise the strength of the relationships presented.
Out of interest, design, and necessity all conclusions are based on
the apparent predictive relationships in the data.

Tests for Independence of Classes 

These tests determined if the classes of students that were
randomly sampled were from the same or identical populations. Scores
on ITED Test 6 were used to compare the classes. As indicated in Table
4, students were grouped by thirds on the basis of percentile ranks on
Iowa Norms. Note that for all contingency tables the numbers in
parentheses represent expected values whereas those without parentheses
represent observed frequencies. The following χ² test for k
independent samples was employed in determining independence (15,
p. 175):

\[ \chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{k} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad df = (k-1)(r-1) \qquad (1) \]

where O_ij and E_ij are the observed and expected frequencies in row i
and column j.

Since p < .30 is greater than α = .05, the null hypothesis was retained,
i.e., the classes were independent with respect to scores on
ITED Test 6. The diversity exhibited by the classes respective of
the distribution of scores by levels was attributed to random sampling
error.
Because of the limited power associated with the χ² test, it
was decided that the significance of the diversity be assessed using
the Kruskal-Wallis one-way analysis of variance by ranks. This test
assumes that the variable being assessed has an underlying continuous
distribution which can be measured at least ordinally. It should be
stressed that the Kruskal-Wallis test has asymptotic efficiency of 3/π
= 95.5 percent with respect to the F test as described by Andrews in
Siegel (15, p. 194). Using formula (2) as given in Siegel (15, p. 135)
the corrected H for the data in Table 4 was 8.273. Respective of the
df value associated with this statistic, the decision to retain the null
hypothesis was verified although the value approached rejection at the
0.05 level. This value is actually closer to the 0.10 level of significance.
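As an illustration of the statistic named above, the sketch below computes the Kruskal-Wallis H with the standard correction for ties. It is a generic textbook form, assumed here to correspond to the cited formula (2); the function name and the sample scores are hypothetical, since the raw class scores are not given in the text.

```python
from collections import Counter


def kruskal_wallis_h(groups):
    """Return H, corrected for ties, for a list of samples (lists of scores)."""
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    # Midranks: tied scores share the average of the ranks they occupy.
    first_rank = {}
    for position, value in enumerate(pooled, start=1):
        first_rank.setdefault(value, position)
    counts = Counter(pooled)
    midrank = {v: first_rank[v] + (counts[v] - 1) / 2 for v in counts}
    # H = 12 / (N(N+1)) * sum(R_j^2 / n_j) - 3(N+1)
    rank_term = sum(sum(midrank[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    return h / correction


# Hypothetical scores for three small classes (not data from the study).
example_h = kruskal_wallis_h([[12, 15, 15, 20], [9, 11, 15, 17], [8, 8, 10, 14]])
print(round(example_h, 3))
```

The resulting H is referred to the chi-square distribution with (number of groups - 1) degrees of freedom, so five classes give df = 4.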

It appears from the contingency table in Table 4 that the classes
do not reflect a normal distribution with respect to ITED Test 6.
Furthermore, the frequency distribution of scores exhibited a skewed,
bimodal form. The significance of the discrepancy from the normal
distribution was determined by a χ² Goodness of Fit test (17, pp. 483-484).
The hypothesis that the parent population was normally distributed was
retained at the 0.05 level of significance (χ² = 14.69, df = 10).
However, the χ² test is somewhat insensitive to skewness and kurtosis
because of its failure to regard the signs of the discrepancies.



TABLE 4

DISTRIBUTION OF STUDENT SCORES
ON ITED TEST 6 (IOWA NORMS) FOR EACH CLASS

ITED                              Class
Level*        A         B         C         D         E       Total

High         13        11         8         7         7         46
           ( 9.2)    ( 9.2)    ( 9.2)    ( 9.2)    ( 9.2)
Middle        6         3         5         8         8         30
           ( 6.0)    ( 6.0)    ( 6.0)    ( 6.0)    ( 6.0)
Low           1         6         7         5         5         24
           ( 4.8)    ( 4.8)    ( 4.8)    ( 4.8)    ( 4.8)
Total        20        20        20        20        20        100

*by thirds based on percentile rank

χ² = 10.63
df = 8
p < .30

Decision: Cannot reject the null hypothesis
When the signs of the discrepancies appear to exhibit a pattern, the χ²
test is not appropriate and the more sensitive alpha statistics
should be used (17, p. 484). To test for normal distribution of the
parent population, it is necessary to compute the α₃ and α₄ values
(17, pp. 180-181) and compare the values with the tabled values for
a normally distributed population (17, Table G). According to Tate
(17, p. 447) the assumption of normal distribution is in doubt if
either value is significantly large. This assumption was reflected
for these data. Although non-normal peakedness was not so
sufficiently large as to discredit the hypothesis of normal distribution,
this hypothesis was in doubt because of skewness to the left
as evidenced by the α₃ value. The rationale developed in the above
paragraphs precluded the use of parametric tests in the analysis of
data (15, pp. 19-20).
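The alpha statistics mentioned above can be sketched as follows, assuming they are the usual moment coefficients of skewness (third moment over the cube of the standard deviation) and kurtosis (fourth moment over the fourth power of the standard deviation). The function name and the sample scores are hypothetical and are not taken from the study.

```python
def alpha_statistics(scores):
    """Return (alpha3, alpha4): moment coefficients of skewness and kurtosis."""
    n = len(scores)
    mean = sum(scores) / n
    # Central moments about the mean, using N (not N - 1) in the denominator.
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical scores with a long tail toward the low end.
alpha3, alpha4 = alpha_statistics([4, 11, 13, 14, 14, 15, 15, 16, 16, 17])
print(round(alpha3, 3), round(alpha4, 3))
```

A negative alpha3 indicates skewness to the left; for a normal distribution alpha3 is 0 and alpha4 is 3, and the tabled values give the sampling limits for a given sample size.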

Tests of Independence With Respect to Total ACTS Inventory Scores 

Table 5 illustrates the test for independence between classes
when compared on the basis of ACTS Inventory scores. The frequency
distribution of scores on the ACTS Inventory for the combined classes
was divided into thirds. Because of the disproportionate number of
students at the extremes of some of the levels, students with borderline
scores were randomly assigned to either of the adjacent levels
to effect equal representation in the levels. The decision to retain
the null hypothesis was evidenced by the indicated χ² value. The
obvious discrepancy between the expected and observed scores for
TABLE 5

DISTRIBUTION OF STUDENT TOTAL SCORES
ON ACTS TEST FOR EACH CLASS

ACTS                              Class
Level*        A         B         C         D         E       Total

High         11         7         6         4         5         33
           ( 6.6)    ( 6.6)    ( 6.6)    ( 6.6)    ( 6.6)
Middle        5         5         4        11         9         34
           ( 6.8)    ( 6.8)    ( 6.8)    ( 6.8)    ( 6.8)
Low           4         8        10         5         6         33
           ( 6.6)    ( 6.6)    ( 6.6)    ( 6.6)    ( 6.6)
Total        20        20        20        20        20        100

*by thirds based on frequency distribution of ACTS scores

χ² = 13.33
df = 8
p < .20

Decision: Cannot reject the null hypothesis



Class A students at the high ACTS Inventory level may be, in part,
a reflection of the somewhat higher proportion of students in this
class who were in the highest third on ITED Test 6. Since there
were only 20 students in each class, the use of an ITED Test 6
control as a third dimension in a χ² test was negated. One would
suspect on the basis of this outcome that instruction in biology as
defined for each class does not alter achievement as measured by the
ACTS Inventory.

By applying the Kruskal-Wallis test to the above, it was found
that there is indeed no statistically significant difference at the
0.05 level in average scores on the ACTS Inventory for the five classes.
The corrected H value was 7.447. The probability is less than
0.10 that this H value would be obtained in the case of a true null
hypothesis.

As an assessment of concurrent validity Table 6, showing the
relationship between ITED Test 6 scores and ACTS Inventory scores,
was generated. The corresponding probability figure is less than 0.001,
and the hypothesis of independence was strongly discredited. That is,
if a student's score is in the high level for the ACTS Inventory it is
highly likely that his score on ITED Test 6 will also be high. This
same pattern is also in evidence for the low levels. In cases such
as this where the null hypothesis was rejected a contingency coefficient
was calculated as a measure of predictive association. This value
was derived by using formula (3) as cited in Siegel (15, p. 197).
The upper limit of the contingency coefficient C approaches unity as
the number of categories for each variable increases. For a 3 x 3 table
as used in Table 6 the computed C cannot exceed 0.816. Thus, the C value
for Table 6 of 0.556 is respectably high. Since both tests (ACTS
Inventory and ITED Test 6) measure aspects of critical thinking in
science, a high C value is to be expected. It does support the
concurrent validity of the ACTS Inventory. As evidenced by the
extreme cells in Table 6, this coefficient could be assigned a
plus sign.
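The figures quoted for Table 6 can be retraced with a short calculation. The sketch below assumes the usual form of the contingency coefficient, C = sqrt(χ² / (N + χ²)), and an upper limit of sqrt((k - 1) / k) for a k x k table; the function names are introduced here for illustration. One cell of Table 6 is illegible in the source copy and is taken as 14, the value implied by the row and column totals.

```python
from math import sqrt


def chi_square(observed):
    """Chi-square for an r x k table of observed frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return sum(
        (o - r * c / n) ** 2 / (r * c / n)   # (O - E)^2 / E, with E = r * c / N
        for row, r in zip(observed, row_totals)
        for o, c in zip(row, col_totals)
    )


def contingency_coefficient(chi2, n):
    """C = sqrt(chi2 / (N + chi2))."""
    return sqrt(chi2 / (n + chi2))


def max_contingency_coefficient(k):
    """Upper limit of C for a k x k table."""
    return sqrt((k - 1) / k)


# Rows: ACTS level (High, Middle, Low); columns: ITED level (High, Middle, Low).
# The Low/Middle cell (14) is inferred from the marginal totals.
table6 = [[22, 7, 2], [9, 15, 7], [1, 14, 23]]
chi2_t6 = chi_square(table6)
c_t6 = contingency_coefficient(chi2_t6, 100)
print(round(chi2_t6, 2), round(c_t6, 3), round(max_contingency_coefficient(3), 3))
```

Computed this way, χ² comes out near 44.76 and C near 0.556, in line with the values reported in Table 6 apart from rounding, and the 3 x 3 ceiling is 0.816.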

Tests of Independence for ACTS Inventory Scores by Cognitive Category 

A series of tests were run using formula (1) to determine
independence of the classes respective to their distribution at
each cognitive category. A summary of the probability figures associated
with each of these tests is given in Table 7.

For Cognitive Category 1 (Knowledge) performance by level varied
from class to class. In studying the contingency table it was revealed
that the contingency coefficient was positive. Its value was
calculated at C = .376. Several cells in this contingency table
contributed disproportionately to the χ² value. The researcher had
anticipated this outcome because Class E was at the time of the interviews
enrolled in a biology course. Thus one would predict, as confirmed
from the data, that this class would outperform the other


TABLE 6

DISTRIBUTION OF STUDENT TOTAL SCORES
ON ACTS TEST BY LEVEL ON ITED TEST 6

ACTS                   ITED Level#
Level*       High      Middle      Low       Total

High          22          7          2         31
            ( 9.9)     (11.2)     ( 9.9)
Middle         9         15          7         31
            ( 9.9)     (11.2)     ( 9.9)
Low            1         14         23         38
            (12.2)     (13.7)     (12.2)
Total         32         36         32        100

*by thirds based on frequency distribution of ACTS scores

#by thirds based on frequency distribution of standard
scores on ITED

χ² = 44.74
df = 4
p < .001

Decision: Reject the null hypothesis

C = .556




TABLE 7

ASSOCIATED PROBABILITY FIGURES
FOR EACH COGNITIVE CATEGORY WHEN
PERFORMANCE ON THE ACTS TEST
AMONG CLASSES WAS COMPARED

Cognitive
Category               Probability Figure     Decision (α = .05)

Knowledge              p < .02                Reject
Application            p < .20                Retain
Collection of Data     p < .80                Retain
Analysis of Data       p < .30                Retain
Synthesis of Data      p < .50                Retain
Withholds Judgment     p < .20                Retain
Evaluation of Data     p < .10                Retain
classes for the Knowledge category. The data further revealed that
students from BSCS classes (Class A) or students with teachers using
BSCS philosophy and rationale (Classes A and B) may outperform
students with more traditional biology backgrounds. One could suggest
that emphasis by a BSCS teacher on developing the structure of the
discipline may tend to promote retention of facts, concepts and
principles. Such a premise, however, would not guarantee that all
students with BSCS training retain more knowledge. Possibly only the
better students as defined by ITED Test 6 can profit in this respect.
Furthermore, there was little difference in distribution for Classes
C and D. This was to be expected since both classes were taught by
teachers using similar philosophy and teaching strategies. Both
were also non-BSCS classes (refer to Tables 2 and 3). Deletion of
Class E from the contingency table resulted in failure to reject the
null hypothesis. This borderline test indicates that further research
is needed to confirm the results.

Note that for all the remaining categories represented in Table 7
the null hypothesis was retained. At first glance this suggests that
biology background, independent of other measures of achievement or
aptitude, does not alter performance significantly at the higher
cognitive categories. This is especially true for the Collection of
Data category for which the probability of obtaining such a value
under a true null hypothesis was about 0.80. At the other extreme the
ability to evaluate data approaches rejection of the null hypothesis
at the .05 level of significance. Here the χ² value is significant
at the 0.10 level. It may well be that a teacher using BSCS strategies
could favorably influence a student's ability to evaluate. The range of
p values given for these tests reflects trends that should encourage
further research in this area. The data further suggest, because of the
range in p values, that factors other than intelligence or native ability
are operational.

The underlying null hypotheses reflected in Table 8 considered
the independence of contiguous cognitive categories respective of other
contiguous categories. The research question asked is whether students
who do well on adjacent categories of cognition also do well on
other combinations of adjacent categories.

Table 8 represents a three dimensional modification of the χ²
test for independence. Formula (4) as referenced by Tate (16, pp. 74-75)
was used for calculating the χ². Here expected frequencies are
calculated by finding the product of the three marginal totals for
each cell and dividing by the square of the grand total.

In Table 8 the null hypothesis was strongly rejected since the
corresponding probability figure was less than 0.001. This was further
substantiated by the positive contingency coefficient of C = .534.
Because rejection was indicated two dimensional tests that compared
separate pairs of contiguous categories were performed. Rejection of
the null hypotheses was also suggested in each case.




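\[ \chi^2 = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{c} \frac{(O_{ijk} - E_{ijk})^2}{E_{ijk}}; \qquad df = abc - (a + b + c - 2) \qquad (4) \]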
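The three dimensional calculation described for Table 8 can be retraced as a sketch: each expected frequency is the product of the three marginal totals divided by the square of the grand total, and df = abc - (a + b + c - 2). The function name and array layout are assumptions made for the example. One observed cell is illegible in the source copy and is taken as 4, the value implied by the stated marginal totals.

```python
from itertools import product
from math import sqrt

# observed[i][j][k]: i = Categories 1 and 2, j = Categories 3 and 4,
# k = Categories 5 and 6; index 0 = High, index 1 = Low.
# The cell observed[1][0][0] (4) is inferred from the marginal totals.
observed = [
    [[23, 5], [9, 8]],
    [[4, 9], [14, 28]],
]


def three_way_chi_square(obs):
    """Chi-square and df for complete independence in an a x b x c table."""
    a, b, c = len(obs), len(obs[0]), len(obs[0][0])
    cells = list(product(range(a), range(b), range(c)))
    n = sum(obs[i][j][k] for i, j, k in cells)
    m1 = [sum(obs[i][j][k] for j in range(b) for k in range(c)) for i in range(a)]
    m2 = [sum(obs[i][j][k] for i in range(a) for k in range(c)) for j in range(b)]
    m3 = [sum(obs[i][j][k] for i in range(a) for j in range(b)) for k in range(c)]
    chi2 = 0.0
    for i, j, k in cells:
        expected = m1[i] * m2[j] * m3[k] / n ** 2   # product of marginals / N^2
        chi2 += (obs[i][j][k] - expected) ** 2 / expected
    df = a * b * c - (a + b + c - 2)
    return chi2, df, n


chi2_t8, df_t8, n_t8 = three_way_chi_square(observed)
c_t8 = sqrt(chi2_t8 / (n_t8 + chi2_t8))
print(round(chi2_t8, 2), df_t8, round(c_t8, 3))
```

Under these assumptions the calculation gives χ² near 39.98 with df = 4 and C near 0.534, agreeing with the reported values apart from rounding.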
TABLE 8

DISTRIBUTION OF STUDENT SCORES ON COGNITION
CATEGORIES 1 AND 2 RELATIVE TO SCORES ON COGNITION
CATEGORIES 3 AND 4, AND COGNITION CATEGORIES 5 AND 6

                          Categories 3 and 4
Categories          High                    Low
1 and 2*        Cats. 5 and 6          Cats. 5 and 6
               High       Low         High       Low

High            23          5           9          8
              ( 9.2)     ( 9.2)      (13.3)     (13.3)
Low              4          9          14         28
              (11.3)     (11.3)      (16.2)     (16.2)

Totals: Categories 1 and 2   High, 45
                             Low,  55
        Categories 3 and 4   High, 41
                             Low,  59
        Categories 5 and 6   High, 50
                             Low,  50
        Grand Total, 100

*levels determined from frequency distributions of
the sums of the scores on each combination of
categories

χ² = 39.97
df = 4
p < .001

Decision: Reject the null hypothesis
A series of tests were conducted to determine the extent to
which performance on one cognitive category is predictive of performance
on the remaining cognitive categories combined. The C
values associated with these tests are given in Table 9. All
null hypotheses were rejected. This suggests that performance at
any one level is predictive of performance on the entire ACTS
Inventory. One is then led to speculate that each category contains
items varying in difficulty. Performance on Category 2 (Application)
was the best predictor of success on the total inventory. Possibly
this category is the least affected by student background and
concomitantly most influenced by the general ability of the students.
Unfortunately contingency tables could not be used to assess performance
by class comparisons or by level comparisons on ITED Test 6
with any degree of confidence since the calculated expected frequencies
in most instances did not fulfill the minimum requirements.
In future research, where larger samples are employed, such comparisons
should further elucidate the reasons for rejection of the null
hypotheses found here.

Estimates of Item Difficulty, Discrimination and Test Reliability

Table 10 illustrates the item difficulty for the 42 item ACTS
Inventory. Note that although not rectangular in distribution, a
considerable spread in difficulty is evident. The values represented



TABLE 9

CONTINGENCY COEFFICIENTS FOR TESTS COMPARING
SCORES ON EACH COGNITIVE CATEGORY
OF THE ACTS TEST WITH SCORES ON THE
REMAINING COGNITIVE CATEGORIES

                                     CONTINGENCY
CATEGORY COMPARISONS*                COEFFICIENT
                                     VALUE**

1     VS    2-6                      C = .422
2     VS    1,3-6                    C = .551
3     VS    1,2,4-6                  C = .377
4     VS    1-3,5,6                  C = .482
5A    VS    1-4,5B,6                 C = .430
5B    VS    1-4,5A,6                 C = .354
6     VS    1-5                      C = .420

*refer to Table 1 for category designations

**all values were positive based on the
associated contingency table for each


TABLE 10

DIFFICULTY INDICES DISTRIBUTION
FOR THE TOTAL ACTS INVENTORY

Interval*     Number     Percent     Item Numbers

85 - 89         1           2        25
80 - 84         1           2        18
75 - 79         5          12        12, 21, 30, 38, 40
70 - 74         1           2        3
65 - 69         3           7        4, 20, 32
60 - 64         1           2        29
55 - 59         1           2        22
50 - 54         1           2        15
45 - 49         2           5        2, 11
40 - 44         3           7        17, 19, 27
35 - 39         2           5        6, 39
30 - 34         1           2        34
25 - 29         1           2        16
20 - 24         5          12        1, 14, 23, 33, 37
15 - 19         5          12        5, 8, 10, 31, 35
10 - 14         5          12        9, 24, 26, 28, 42
 5 -  9         1           2        36
 0 -  4         3           7        7, 13, 41

*Smaller index values represent greater difficulty




were determined by the number of students getting each item correct
divided by the number of students. Thus, an index of difficulty is
an expression of the number of correct responses for an item.
Unfortunately, test construction has been dominated by theories and
practices that stress the identification and measurement of individual
differences (13). Items at the 50 percent level of difficulty are
most effective in discrimination. Such a spread in difficulty as
exhibited here has recently been recommended by Tyler (18) as a goal
in developing better measuring instruments.
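The difficulty index described above (students satisfying an item divided by the number of students, expressed as a percent) and its grouping into five-point intervals can be sketched as follows. The function names and the small response matrix are hypothetical and are not data from the study.

```python
def difficulty_indices(responses):
    """Percent of students satisfying each item.

    responses: one list per student, with 1 = criterion satisfied, 0 = not.
    """
    n_students = len(responses)
    return [100 * sum(item) / n_students for item in zip(*responses)]


def interval_label(index, width=5):
    """Label of the five-point interval (e.g. '75 - 79') containing an index."""
    low = min(int(index // width) * width, 100 - width)
    return f"{low} - {low + width - 1}"


# Hypothetical responses: four students by three items.
responses = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
indices = difficulty_indices(responses)
labels = [interval_label(i) for i in indices]
print(indices, labels)
```

Because the index counts correct responses, smaller values mark the more difficult items.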

In Table 11 the item difficulty is expressed by cognitive category.
As expected from the previous discussion the categories do
exhibit a wide spectrum of difficulty with respect to the items in
them. This trend, however, is not as apparent for the higher cognitive
categories, i.e., 4, 5A and 6. Items in Category 5B (Reserves
Judgment) required only a yes or no response. The guess factor
present here could account for the diversity of difficulty shown.
Respective of the other higher categories, especially Category 6
(Evaluation), it could be concluded that criteria so classified are
less likely to be satisfied simply because, by definition, they
represent the highest level of cognition.

TABLE 11

DIFFICULTY INDICES DISTRIBUTION
FOR THE SEVEN COGNITIVE CATEGORIES
OF THE ACTS INVENTORY

Index                         Cognitive Category
Interval    1        2        3        4        5A       5B       6

85 - 89                       25*
80 - 84                                                  18
75 - 79     12       30       40       21       38
70 - 74              3
65 - 69              20       32                         4
60 - 64                                29
55 - 59                       22
50 - 54              15
45 - 49              2, 11
40 - 44              19       17                         27
35 - 39     6                          39
30 - 34              34
25 - 29              16
20 - 24              1        14       37       33       23
15 - 19     31                8                 10       35       5
10 - 14              9                 26       28                24, 42
 5 -  9                                                           36
 0 -  4     7                          13       41

*item numbers are recorded in the table

The split-halves method was used to calculate indices of
discrimination. They were determined by finding the difference in the
proportion of correct responses between the groups of students
scoring in the top 27 percent on the total ACTS Inventory and the bottom
27 percent (6, p. 352). Positive values indicate that high scoring
students answered the item correctly more frequently than low
scoring students. The discrimination value approaches one as the
relative difference increases. Table 12 shows the discrimination
for each item on the total ACTS Inventory and for each cognitive
category respectively. Interestingly, no negatively discriminating
items are evidenced. In other words, there were no items on which
students with low scores outperformed students with high scores.
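
The upper-lower comparison described above can be sketched as follows. The function name, the rounding of the 27 percent group size, the handling of tied totals (left to the sort order), and the sample responses are all assumptions made here for illustration; the source gives only the general procedure.

```python
from typing import List


def discrimination_indices(responses: List[List[int]],
                           fraction: float = 0.27) -> List[float]:
    """Upper-lower discrimination index for each item.

    Students are ranked by total score. For each item the index is the
    proportion correct in the top group minus the proportion correct
    in the bottom group, so it ranges from -1 to +1.
    """
    ranked = sorted(responses, key=sum, reverse=True)
    group_size = max(1, round(fraction * len(ranked)))
    upper = ranked[:group_size]
    lower = ranked[-group_size:]
    n_items = len(responses[0])
    return [
        (sum(s[i] for s in upper) - sum(s[i] for s in lower)) / group_size
        for i in range(n_items)
    ]


# Hypothetical scored responses: 10 students, 4 items (not from the study).
sample = [
    [1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 1],                 # totals 4, 3, 3
    [1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1], [1, 0, 0, 1],   # totals 2
    [1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0],                 # totals 1, 1, 0
]
d = discrimination_indices(sample)
```

In this hypothetical sample every index is positive, which corresponds to the situation reported for the ACTS Inventory: no item on which low scorers outperform high scorers.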

As a final note, the reliability of the ACTS Inventory as computed
by a modification of Flanagan R as cited in Ebel (6) was 0.86. A
minimum value of 0.70 is recommended for such measures of reliability.
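
For orientation, the standard (unmodified) Flanagan split-half formula, r = 2[1 - (var_a + var_b) / var_total], can be sketched as below. The modification used in the study is not described in this section, so the sketch should not be read as the authors' exact computation; the function name and the half-test scores are hypothetical.

```python
from statistics import pvariance
from typing import Sequence


def flanagan_reliability(half_a: Sequence[float],
                         half_b: Sequence[float]) -> float:
    """Standard Flanagan split-half reliability estimate.

    half_a and half_b hold each student's score on the two halves of
    the test; the total score is their sum.
    """
    totals = [a + b for a, b in zip(half_a, half_b)]
    ratio = (pvariance(half_a) + pvariance(half_b)) / pvariance(totals)
    return 2.0 * (1.0 - ratio)


# Hypothetical half-test scores for 5 students (not from the study).
half_a = [5, 4, 3, 2, 1]
half_b = [4, 4, 3, 1, 1]
r = flanagan_reliability(half_a, half_b)
```

When the two halves rank students identically the estimate approaches 1; values near the reported 0.86 indicate that the halves agree closely.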

TABLE 12

DISCRIMINATION INDICES DISTRIBUTION
FOR THE SEVEN COGNITIVE CATEGORIES
OF THE ACTS INVENTORY

Index                         Cognitive Category
Interval    1        2        3        4        5A       5B       6

.70 - .79                     14*      29       10
.60 - .69   6, 31    19       22       21       33       4, 18
.50 - .59            1, 11    32, 40   37, 39   28, 38            24, 42
                     15, 20
                     34
.40 - .49   12                17       13                23, 27   5, 36
.30 - .39            2, 16
                     30
.20 - .29   7                 8, 25    26       41
.10 - .19            3, 9

*item numbers are recorded in the table

Conclusions

An instrument was developed that measured cognitive performance
in biology at seven levels of cognition. Based on results obtained
from the ACTS Inventory, the criterion measure extracted from the
instrument, and ITED Test 6, a measure of critical thinking ability in
science, the following conclusions were formed. They reflect the
validity of the instrument and suggest directions for future research.

1) The five classes tested did not differ with respect to
performance on the total ACTS Inventory, i.e., the ratios of students
performing at the high, middle, and low levels for the ACTS Inventory
did not vary significantly from class to class. Thus, the null
hypothesis could not be rejected. When total ACTS Inventory scores served
as a criterion measure, students taught by BSCS teachers or by teachers
using BSCS philosophy and rationale do not perform significantly
better than students taught by non-BSCS teachers who do not use BSCS
philosophy and rationale. It should be noted that in most cases the
ACTS Inventory was administered at least three months after the
students had completed their biology course.

2) Performance of students by level on ITED Test 6 was associated
with performance of students by level on the total ACTS Inventory. Thus,
the null hypothesis was rejected.

3) With one exception, when levels of performance on the individual
cognitive levels were used to compare the five classes, the null
hypotheses could not be rejected. The null hypothesis was rejected
for Category 1 (Knowledge). However, when the class presently taking
biology was deleted, the null hypothesis could not be rejected at the
0.05 level of significance although the classes could be considered
dependent at the 0.10 level of significance. This same level of
significance was found for the χ² value associated with the table comparing
performance by classes on Category 6 (Evaluation). The greatest
independence among the five classes was shown for Category 3
(Collection of Data).

4) Performance by levels for pairs of contiguous categories of
cognition for the ACTS Inventory in the assumed hierarchy of cognition
was related to performance on other contiguous pairs of cognitive
categories. Thus, the null hypotheses were rejected. Students who
performed at one level on adjacent categories of cognition tended to
perform at the same level on other combinations of adjacent categories.
Although diversity in the range of association existed, it was
apparent that performance by students was consistent throughout
contiguous pairs of categories.

5) Performance by level at any one cognitive category of the
ACTS Inventory is related to performance by level on the remaining
cognitive categories combined. That is, the underlying null
hypotheses of independence were rejected. Performance on Category 2
(Application) was the best predictor of success on the total ACTS Inventory.

6) The test analysis data for the ACTS Inventory corroborated
the above findings that performance by levels for one or a combination
of ACTS Categories is not independent of performance on other categories.
This could in part be attributed to the spread of difficulty
exhibited for most of the cognitive categories, i.e., items in most
categories ranged from easy to difficult. This trend was less evident for
the higher cognitive categories. No item on the ACTS Inventory had a
negative index of discrimination. The reliability for the ACTS
Inventory, determined by a modification of the Kuder-Richardson formula,
was 0.86.
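
The tests of independence referred to in conclusions 3) through 5) rest on a chi-square statistic computed from a contingency table of performance levels, and the strength of association is reported as a contingency coefficient C. A minimal sketch follows, assuming the usual Pearson definition C = sqrt(χ² / (χ² + N)); this section of the source does not spell out the formula, and the 3 x 3 table of high, middle, and low levels below is hypothetical.

```python
import math
from typing import List


def chi_square(table: List[List[float]]) -> float:
    """Pearson chi-square statistic for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def contingency_coefficient(table: List[List[float]]) -> float:
    """Pearson contingency coefficient, C = sqrt(chi2 / (chi2 + N))."""
    chi2 = chi_square(table)
    n = sum(sum(row) for row in table)
    return math.sqrt(chi2 / (chi2 + n))


# Hypothetical counts: rows are levels (high, middle, low) on one
# category, columns are levels on the remaining categories combined.
levels = [
    [10, 5, 5],
    [5, 10, 5],
    [5, 5, 10],
]
c = contingency_coefficient(levels)
```

A table whose rows and columns are unrelated gives C = 0; larger values, such as the .551 reported for Category 2, indicate stronger association.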

In the light of the above findings it is evident that the scheme
developed here for evaluation in biology has a multiplicity of
applications. A propensity toward administration of the data collecting
instrument via computer assisted instruction is indeed evident. The
technology in this area has already been developed. Large scale
testing could well provide the means for assessing the purported goals of
science education.

APPENDIX

ASSESSMENT OF COGNITIVE TRANSFER
IN SCIENCE INVENTORY
(With Explanatory Notes)



1. Elicits through direct and indirect questioning three components
of the petri dish related to the growth of living organisms.
(1-A2, or 1-A4)*

After examining the petri dish the student should list a minimum
of three components of the solid material in the bottom of the
dish. The following are deemed appropriate: agar, gelatin-like
substance, food, protein, minerals, starch, sugar, vitamins, water,
nutrients, carbohydrates, fats, lipids, organic matter, etc.

2. States that the dishes allow for a free exchange of gases.
(1-B1, 1-B2, or 1-B3)

3. States that the dishes provide an environment for growing pure
cultures. (Cover prevents entrance of other organisms.) (1-C1)

The student's statement should reflect the fact that he is
aware that when such dishes are left with the cover off spores
from the air can enter and grow.

4. Reserves judgement when asked if the same organism is growing
in each dish. (2-B1)

To satisfy this criterion the student must indicate that he is
not sure that the same thing is growing on each plate. This is
of course dependent upon his conception of "same thing". All
such decisions in the ACTS Inventory are to be classified as
Reserve Judgement. The evaluator need not concern himself
with these criteria.

5. Gives reason for reserving judgement. (2-B2a, 2-B2b, or 2-B3)

To satisfy this criterion the student must state that he does
not have adequate information to make a decision. Stating
that he is not sure because the organisms on the two plates
look different does not satisfy the criterion.


*Refers to frame designations in interview sequence.

6. Given examples thereof recalls the term species. (2-C1)

7. Defines species in terms of reproductive isolation. (2-C2 or
2-C7)

The student's definition must include a component that suggests
that his concept of species encompasses reproductive isolation,
i.e., he might state that to be of the same species organisms
must be able to reproduce (reproduce fertile offspring). He
does not satisfy this criterion by stating that species look
similar or that they live in the same surroundings.

8. Suggests a procedure to determine if the plates contain the
same species. (2-C3, 2-C5, or 2-C6)

At this point the student has elicited or has just been given a
definition of species. To satisfy this criterion the student
should state that one could mate or cross the materials in
each dish.

9. Predicts what results would be needed to confirm hypothesis.
(2-C4)

The student has either elicited or has just been told that one
approach would be to mate or cross the materials in the two
dishes. The student should respond by stating that he would
look for fruits or offspring.

10. Relates the presence of fruits in Situation III to the
experimental design of Situation II stating that the presence of
fruits in the experiment would confirm the hypothesis. (3-A1)

To satisfy this criterion the student should state the structures
he sees here might be the offspring produced by the mating of the
two organisms in Situation II.

11. When given additional information satisfies Criterion 10 above.
(Automatically satisfies this criterion if Criterion 10 is
satisfied.) (3-A2b)

All students are now given frame 3-A2. If the student did not
satisfy Criterion 10, he is not asked frame 3-A2b. To satisfy
this criterion requires the same response as Criterion 10
required.

12. When given an example thereof states a definition of the term
hypothesis. (3-A4)

A satisfactory response would include a phrase that suggests
subsequent verification by experimentation. A sample definition
of hypothesis is given in Frame 3-A6.

13. Refines previous experiment by employing the stated definition
of species in order to confirm that the two plates contain the
same species. (3-A5)

A statement analogous to that given in Frame 3-A8 would satisfy
this criterion.

14. Given additional information satisfies Criterion 13 above.
(Automatically satisfies this criterion if Criterion 13 is
satisfied.) (3-A7)

15. Interprets results of experiment by relating to preformed
hypothesis. (3-B2)

Here the student's response must include a statement regarding
the fertility of the initial offspring.

16. Explains the failure to produce fruits when two spores from two
separate fruits are mated. (3-C2)

17. States three (3) morphological differences between plates of
Situations III and IV. (4-A2 or 4-A5)

18. Reserves Judgment. (4-B1)

19. Provides evidence to support given assumption. (4-B2 or 4-B4)

20. Asks three (3) relevant questions regarding environmental factors
in which the plates were grown. (4-C1, 4-C5, or 4-C6)

See Frame 4-C6 for example. Other questions may include such
factors as: light, temperature, position of plates, humidity,
added chemicals or atmosphere in plates.

21. Focuses on new problem by designing an experiment to determine
the effect of light on fruiting. (5-A1)

Here the student is asked to focus on a new problem. To satisfy
this criterion the student must state that he would grow some
plates in the light and some in the dark. In addition he must
include any one of the following components or any other relevant
component:

1. continuum of light intensity, quality or direction.

2. genetic continuity of organisms used.

3. two or more variables simultaneously, one of which is light.

4. all variables constant except light.

22. ... grown in the light produce fruits.

... statement should be included as part of the student's
remarks. Any alternative response must be directly
related to the effects of light on fruiting, i.e., his
observations must be relevant to the stated design of the experiment.

23. Reserves Judgment. (6-A3 or 6-A5) 

24. ... satisfying ... proposing a specific mechanism ... activate
... fruit formation. ... synthesis of this enzyme could be
controlled chemically.

25. ... Students are asked to refocus their attention. The only
... plates is in the relative ... state that the culture ... as
larger or more dense than the one grown in ...

26. Establishes relationship between two ... light and growth rate
and how they are related to the fruiting process. (6-B2)

... a relationship involving ... He does not satisfy this criterion
by merely stating ... growth is more rapid in the dark than in the
light.

27. Reserves Judgment. (6-B3)

28. Develops alternative explanation for the production of fruits.

Other than the sample response the student could state that
possibly more time is required for fruiting to occur in the dark
or that light triggers a sequence of chemical reactions that
could also be accomplished by adding the right chemicals.

29. Given that an inhibitor to fruiting is normally produced by this
organism, explains the relationship between the inhibitor and
light. (6-B5)

30. Employs the principle of parsimony in deciding on the direction
of further experimentation. (6-B7)

31. Suggests what the inhibitor might be when given that it is a
normal product of respiration. (6-B8, 6-B9, or 6-B11)

Given that the product might be a simple product of respiration,
the student predicts what this product might be. Satisfying
this criterion would be premised on a fundamental understanding of
the process of respiration.

32. In observing the results of this experiment states that the sealed
plates in the light do not contain fruits. (7-A1)

Although other observations are possible, this is the only one
relevant to the stated experimental design.

33. Proposes relevant explanation for why sealed plates in the light
did not produce fruits. (7-A2)

34. Given added information satisfies Criterion 33 above.
(Automatically satisfies this criterion if Criterion 33 is satisfied.)
(7-A5)

35. Reserves Judgment. (7-A3)

36. Gives reason for reserving judgment that involves removal of
inhibitor. (Dependent upon satisfying Criterion 35) (7-A4a)

37. Suggests procedure for determining if CO2 is the substance that
inhibits fruiting. (7-B1 or 7-B4)

No added information is given in Frame 7-B4. The problem is
merely restated for the student.

38. Predicts that if a substance could be placed in the sealed plates
to absorb the CO2, fruits would be produced if CO2 was the
inhibitor. (Automatically satisfies this criterion if Criterion
37 is satisfied) (7-B5)

39. Given that KOH in solution is an effective CO2 absorber, designs
an experiment to determine what factors influence fruiting.
(7-B2 or 7-B6)

To satisfy this criterion the student must design an experiment
using KOH in sealed plates in the light and in the dark.

40. In rethinking previously stated relationship between light and
fruiting states that light is not necessary for fruiting to occur.
(8-A1)

This statement corresponds to the observation that sealed plates
containing KOH grown in the dark contain fruits.

41. Provides relevant explanation (mechanism) for the appearance of
fruits by interrelating all data collected to this point. (8-A2)

Here the student must elicit relationships involving all the
components of the criterion. (See sample response in interview
sequence)

To satisfy this criterion the student must state that possibly
two strains of the same species were grown on this plate. They
grew toward one another and mated. The line of fruits was then
produced at the point of contact between the compatible strains.

In satisfying this criterion the student must use several concepts
developed throughout the context of the interview sequence. At
the minimum performance level the student would state that two
strains were mated and produced offspring.




